Eve Fleisig

AI, Take the Wheel: What Drives Delegation and Trust in Human-Computer Cooperative Question Answering?

May 27, 2026
Via arXiv

PluriHarms: Benchmarking the Full Spectrum of Human Judgments on AI Harm

Jan 13, 2026
Via arXiv

Balancing Quality and Variation: Spam Filtering Distorts Data Label Distributions

Sep 10, 2025
Via arXiv

GRACE: A Granular Benchmark for Evaluating Model Calibration against Human Calibration

Feb 27, 2025
Via arXiv

Accurate and Data-Efficient Toxicity Prediction when Annotators Disagree

Oct 16, 2024
Via arXiv

ADVSCORE: A Metric for the Evaluation and Creation of Adversarial Benchmarks

Jun 24, 2024
Via arXiv

Standard Language Ideology in AI-Generated Language

Jun 13, 2024
Via arXiv

Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination

Jun 13, 2024
Via arXiv

The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels

May 09, 2024
Via arXiv

Mapping Social Choice Theory to RLHF

Apr 19, 2024
Via arXiv